Synthesis of fast speech with interpolation of adapted HSMMs and its evaluation by blind and sighted listeners
Authors
Abstract
In this paper we evaluate a method for generating synthetic speech at high speaking rates based on the interpolation of hidden semi-Markov models (HSMMs) trained on speech data recorded at normal and fast speaking rates. The subjective evaluation was carried out with both blind listeners, who are used to very fast speaking rates, and sighted listeners. We show that we can achieve a better intelligibility rate and higher voice quality with this method compared to standard HSMM-based duration modeling. We also evaluate duration modeling with the interpolation of all the acoustic features including not only duration but also spectral and F0 models. An analysis of the mean squared error (MSE) of standard HSMM-based duration modeling for fast speech identifies problematic linguistic contexts for duration modeling.
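As a rough illustration of the interpolation idea described in the abstract, the Python sketch below blends a Gaussian state-duration model trained on normal-rate speech with one trained on fast speech, and computes the MSE between predicted and reference durations. It is not the authors' implementation. The names `GaussianDuration`, `interpolate_duration`, and `duration_mse`, the ratio `alpha`, and all numeric values are hypothetical. The variance rule (squared weights) is only one common way to interpolate Gaussians and is assumed here.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class GaussianDuration:
    """Gaussian duration model for one HSMM state (hypothetical container)."""
    mean: float  # expected state duration, in frames
    var: float   # variance of the state duration


def interpolate_duration(normal: GaussianDuration,
                         fast: GaussianDuration,
                         alpha: float) -> GaussianDuration:
    """Blend two duration models.

    alpha = 0 returns the normal-rate model, alpha = 1 the fast-rate model,
    and alpha > 1 extrapolates beyond the fast model.
    Assumption: means are combined linearly and variances with squared weights.
    """
    w_normal = 1.0 - alpha
    w_fast = alpha
    mean = w_normal * normal.mean + w_fast * fast.mean
    var = w_normal ** 2 * normal.var + w_fast ** 2 * fast.var
    return GaussianDuration(mean=mean, var=var)


def duration_mse(predicted: Sequence[float], reference: Sequence[float]) -> float:
    """Mean squared error between predicted and reference durations."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - r) ** 2 for p, r in zip(predicted, reference)) / len(predicted)


if __name__ == "__main__":
    normal = GaussianDuration(mean=10.0, var=4.0)  # made-up values
    fast = GaussianDuration(mean=6.0, var=2.0)     # made-up values
    print(interpolate_duration(normal, fast, 0.5))
    print(interpolate_duration(normal, fast, 1.5))  # extrapolation past "fast"
    print(duration_mse([1.0, 2.0], [1.0, 4.0]))
```

In a full system the same weighting would be applied per state and, when all acoustic features are interpolated, to the spectral and F0 distributions as well. This sketch covers only a single duration distribution.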
Related resources
Comprehension of Ultra-fast Speech – Blind vs. "normally Hearing" Persons
This study explores how much speech can be temporally compressed and still understood by blind people who have daily practice with speech synthesis vs. sighted persons without such training. Three text modes were generated (formant synthesis, natural speech with and without pauses). These texts were presented to sighted listeners at rates between 9-14 s/s and to blind listeners between 17-22 s/...
Investigation of speech intelligibility in 8- to 12-year-old children with spastic cerebral palsy
Background and purpose: Speech intelligibility refers to how well speech is understood by listeners. This study examined speech intelligibility in children (Persian native speakers) with spastic cerebral palsy aged 8-12 years old. Materials and methods: A cross-sectional study was performed in 31 dysarthric students (….. boys and …..girls) in Tehran, 2014. A list of w...
Relationship between postural abnormalities with quality of life and self efficacy of blinds and partially sighted people
Introduction: The purpose of this study was to investigate the relationship between postural abnormalities with the quality of life and self-efficacy of the blind and partially sighted people. Materials and Methods: 100 blind and partially sighted people (mean age 34 years old including 48 males and 52 females) under the supervision of the Association of the Blind and Partially Sighted of Arak ...
Intelligibility analysis of fast synthesized speech
In this paper we analyse the effect of speech corpus and compression method on the intelligibility of synthesized speech at fast rates. We recorded English and German language voice talents at a normal and a fast speaking rate and trained an HSMM-based synthesis system based on the normal and the fast data of each speaker. We compared three compression methods: scaling the variance of the state ...
Modeling Acoustic Rendition of Documents’ Typography using Expressive Speech Synthesis for Sighted and Blind Users
The accessibility to printed and electronic documents (books, newspapers, web content) by the print disabled, as well as the moving users and the elderly, is based on the possibility to convert them (in real time) into acoustic and/or haptic modality. Besides its content, a printed or electronic text document contains a number of presentation visual elements that apply design glyphs or typograp...